Using Dynamic Rewards to Learn a Fully Holonomic Bipedal Walk

Authors

  • Patrick MacAlpine
  • Peter Stone
Abstract

This paper presents the design and learning architecture for a fully holonomic omnidirectional walk used by the UT Austin Villa humanoid robot soccer agent in the RoboCup 3D simulation environment. By “fully holonomic” we mean that the walk allows movement in all directions with equal velocity. The walk is based on a double linear inverted pendulum model and was originally designed for the physical Nao robot. Walk parameters are optimized for maximum speed and stability; at the same time, a novel approach that reweights the rewards for walking speed in the cardinal directions (forwards, backwards, and sideways) promotes equal walking velocities in all directions. A variant of this walk, which uses the same walk engine but is not fully holonomic because it employs three different sets of learned walk parameters biased toward maximizing forward walking speed, was the crucial component in the UT Austin Villa team winning the 2011 RoboCup 3D simulation competition. Detailed experiments reveal that adaptively changing the reward weights over time is an effective method for learning a fully holonomic walk. Additional data show that a team of agents using this learned fully holonomic walk is able to beat other teams that use non-fully holonomic walks, including the 2011 RoboCup 3D simulation champion UT Austin Villa team.
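
The idea of reweighting per-direction rewards can be illustrated with a minimal sketch. The abstract does not state the actual update rule, so everything here is an assumption made for illustration: the inverse-speed weighting, the hypothetical helpers `update_weights` and `weighted_reward`, and the example speeds. It is not the authors' method, and it assumes all measured speeds are positive.

```python
from typing import Dict

# Cardinal walking directions named in the abstract.
DIRECTIONS = ("forward", "backward", "sideways")


def update_weights(speeds: Dict[str, float], eps: float = 1e-6) -> Dict[str, float]:
    """Hypothetical rule: weight each direction inversely to its current
    speed, then normalize so the weights sum to 1. Slower directions get
    larger weights, nudging the optimizer toward equal speeds."""
    inverse = {d: 1.0 / (speeds[d] + eps) for d in DIRECTIONS}
    total = sum(inverse.values())
    return {d: inverse[d] / total for d in DIRECTIONS}


def weighted_reward(speeds: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine per-direction speeds into one scalar reward."""
    return sum(weights[d] * speeds[d] for d in DIRECTIONS)


# Made-up speeds (m/s) from one hypothetical evaluation of a parameter set.
speeds = {"forward": 0.8, "backward": 0.5, "sideways": 0.4}
weights = update_weights(speeds)
reward = weighted_reward(speeds, weights)
```

In an optimization loop, `update_weights` would be called again after each generation, so the weights change as the learned speeds change; the direction that currently lags (here, sideways) contributes most to the reward.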

Similar resources

Gait Generation for a Bipedal System By Morris-Lecar Central Pattern Generator

The ability to move in complex environments is one of the most important features of humans and animals. In this work, we exploit a bio-inspired method to generate different gaits in a bipedal locomotion system. We use the 4-cell CPG model developed by Pinto [21]. This model has been established on symmetric coupling between the cells which are responsible for generating oscillatory signals. Th...

From Passive to Active Dynamic 3D Bipedal Walking

Applying an evolutionary algorithm, we first develop the morphology of a simulated passive dynamic bipedal walking device, able to walk down a shallow slope. Using the resulting morphology and adding minimal motor and sensory equipment, a neural controller is evolved, enabling the walking device to walk on a flat surface with minimal energy consumption. The applied evolutionary algorithm fixes ...

From Passive to Active Dynamic 3D Bipedal Walking – An Evolutionary Approach

Applying an evolutionary algorithm, we first develop the morphology of a simulated passive dynamic bipedal walking device, able to walk down a shallow slope. Using the resulting morphology and adding minimal motor and sensory equipment, a neural controller is evolved, enabling the walking device to walk on a flat surface with minimal energy consumption. The applied evolutionary algorithm fixes ...

Powered bipedal robots based on unpowered walking toys

Unpowered bipedal devices which walk down gentle slopes have been in development, initially as toys, for more than one hundred years. Here we describe three powered robots based on these ramp-walking toys. Humanoid walking robots usually rely heavily on complicated feedback control strategies, have a limited learning ability, have gaits that look somewhat artificial, and use relatively more ene...

First Steps Toward Supervised Learning for Underactuated Bipedal Robot Locomotion, with Outdoor Experiments on the Wave Field

Supervised learning is used to build a control policy for robust, dynamic walking of an underactuated bipedal robot. The training and testing sets consist of controllers based on a full dynamic model, virtual constraints, and parameter optimization to meet torque limits, friction cone, and environmental conditions. The controllers are designed to induce periodic walking gaits at various speeds,...

Publication date: 2012